worms present. If .... hosts a .... each, a frequency distribution of the
counts, i.e., the number of hosts with 0,1,2,... worms of a particular
species, or of all species, can be calculated. It may then be possible to
determine if this observed frequency distribution follows some theoretical
statistical distribution. If it does, then the probability of a particular
worm count occurring in any sample taken from this population can be
determined. The sample estimates of the parameters of the theoretical
distribution can also be calculated from sample counts, adding to existing
knowledge of the observed distribution.

If sacrificing the host to obtain worm counts is undesirable, then
parasite egg counts, obtained by taking subsamples of faecal samples
collected from the infected host, may be used to estimate the true worm
population of the host. Just as in the case of worm counts, a frequency
distribution can then be constructed based on the observed number of faecal
subsamples with 0,1,2,... eggs present.

This  report  is  concerned  with  the  most  popular  theoretical  distributions 
used  by  various  workers  in  the  past  to  fit  the  observed  egg  counts  under 
specified  conditions. 

Peters and Leiper (1940), using two different egg counting techniques,
concluded that the distribution of successive counts of Haemonchus contortus
eggs, when these counts were all from the same suspension of eggs, was
approximately the Poisson. Peters (1941) presents an interesting discussion
of dilution egg counts and gives three examples showing that the variance
or mean square of these counts approximates their respective means. He
concludes that the Poisson series is applicable. Emik (1947) concluded (using
a dilution counting technique) that nematode egg counts from 12
heterogeneous lambs were Poisson distributed. Brambell (1963) confirms
Peters' (1941) position by taking egg counts from housed sheep. Hunter and
Quenouille (1952) took four faecal samples from each of 132 sheep and
determined by a chi-square test (discussed in the next section) that the
distribution of the replicate worm-egg counts in each sheep fitted the
well-known Poisson distribution. They performed other trials that gave
similar results, i.e., Poisson distributed counts; in addition, the over-all
chi-square test (i.e., the chi-square value over all trials) also indicated
the Poisson distribution. When Hunter and Quenouille (1952) investigated 13
series of egg counts from different sheep, however, they found that the
negative binomial distribution (also known as the binomial waiting-time
distribution or the Pascal distribution (Wilks, 1962)) was appropriate.

If the assumption is made that egg counts are accurate estimates of the
worm population, then it becomes important to investigate what distributions
have been found to describe worm counts themselves. As regards this
assumption, it is interesting to note that Willmott and Pester (1952)
performed an experiment in which they tried to ascertain within what limits
egg counts were reliable criteria on which to base estimates of the number of
flukes (paramphistomes) in the host. They admit that the number of
observations was too few to permit definite conclusions, but based on their
observations of
... that the frequency distribution of worm counts in chickens followed the
negative binomial distribution. ...en (1955) found that the negative
binomial ... counts of ... which they obtained while working on the problem
of immunity and tolerance of chickens to ... Li and Hsu (1951) studied 15
species of parasitic nematodes, 3 species of cestodes, and 5 species of
trematodes, and found that the frequency distributions of the counts of
parasites found in their natural hosts (of various types) had the
characteristics of being skewed in the positive direction and similar in
appearance. They hypothesized that ... of the Pearsonian frequency curves
would probably fit most of the curves. Although Northam and Rocha (1958) do
not discount this conclusion, they point out that the negative binomial
distribution would probably fit the data as well, and it would have the
advantage of being discrete just as the data are discrete.

From the conclusions of these various investigators, it appears that
both egg counts and the underlying worm population can be described by the
Poisson or the negative binomial distributions, depending on the sampling
procedure. This report is therefore centered around these two distributions:
their derivations, properties, and applicability to egg count studies.


THE POISSON DISTRIBUTION

Derivation 

The Poisson distribution was discovered by Poisson in 1837. Bortkewitsch
later expanded Poisson's work by illustration. This series was independently
discovered in 1907 (Whitaker, 1914) by "Student" (1907) in his paper entitled
"On The Error of Counting With a Haemacytometer", in which he showed that
the distribution of small particles in a liquid followed the Poisson law

$$e^{-m}\left[1 + m + \frac{m^2}{2!} + \cdots + \frac{m^r}{r!} + \cdots\right] = \sum_{r=0}^{\infty} \frac{e^{-m} m^r}{r!}$$    (1)

where m is the arithmetic mean number of particles per unit volume, e is
the base of the natural logarithm and is approximately equal to 2.718, and
the successive terms in the series give the probability that a given unit
volume contains 0,1,2,...,r,... particles.

The Poisson distribution can be derived in the following manner (Student,
1907; Ostle, 1963). Suppose that a liquid suspension of helminth eggs
obtained from a faecal sample is thoroughly mixed and spread evenly over a
surface marked off into N equal units of area. Suppose further that each area
has an average of m eggs contained in it, resulting in a total of Nm eggs
throughout the whole suspension. If the suspension has been thoroughly mixed,
a given egg will have an equal probability ($1/N$) of falling on any one of
the unit areas, and an equal probability ($1 - 1/N$) of not falling on a
particular unit area. Similarly, the probability that a given unit area will
contain an egg is $1/N$, and not contain an egg is $1 - 1/N$.

It must be assumed that each unit area has the capacity to hold any
number of eggs without affecting the probability of still more eggs falling
... is simply the expansion ...    (3)


Equation (2) is usually written as

$$\binom{n}{x} p^{x} q^{n-x}, \qquad x = 0, 1, 2, \ldots$$    (4)

with $n = mN$, $p = 1/N$, and $q = 1 - 1/N$, where it should be noted that
as $N \to \infty$, the quantity $1/N \to 0$.

The (x + 1)th term in the expansion of (2) is

$$\frac{1}{x!}\, mN[mN-1][mN-2]\cdots[mN-x+1]\left[\frac{1}{N}\right]^{x}\left[1-\frac{1}{N}\right]^{mN-x}$$    (5)

Letting Nm = n, it is evident that $1/N = m/n$ and $1 - 1/N = 1 - m/n$.
Therefore, (5) becomes

$$\frac{n(n-1)(n-2)\cdots(n-x+1)}{x!}\left[\frac{m}{n}\right]^{x}\left[1-\frac{m}{n}\right]^{n-x}$$    (6)

Dividing both the numerator and denominator by $n^{x}$, it is seen that (6)
reduces to

$$\frac{1\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\left(1-\frac{x-1}{n}\right)}{x!}\; m^{x}\left[1-\frac{m}{n}\right]^{n-x}$$    (7)


Taking the limit of this expression as $n \to \infty$ (which implies that
$mN \to \infty$, which in turn implies that $N \to \infty$, since m is
considered fixed) it is seen that (7) becomes

$$\frac{m^{x} e^{-m}}{x!}, \qquad x = 0, 1, 2, \ldots$$    (8)


which  is  the  probability  function  of  the  Poisson  distribution. 

From this derivation, it is apparent that the Poisson distribution is a
limiting form of the binomial distribution given by equation (4) where
$N \to \infty$, resulting in $1/N \to 0$ while the mean number of particles
per unit area (m) remains constant; or as Peters (1941) puts it, "The Poisson
series is simply the binomial series pushed to the limit where p is
indefinitely small, q is near unity, and n is so large that the mean, np, is
an appreciable quantity."
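The limiting argument can be checked numerically. The following is a minimal sketch, assuming an illustrative mean of m = 2 eggs per unit area (a value chosen here, not taken from the report); the function names are hypothetical. It compares the binomial terms with n = mN and p = 1/N against the Poisson terms as N grows.

```python
import math

def binomial_pmf(x, n, p):
    """Binomial term C(n, x) p^x q^(n-x), as in equation (4)."""
    return math.comb(n, x) * p**x * (1.0 - p) ** (n - x)

def poisson_pmf(x, m):
    """Poisson term m^x e^(-m) / x!."""
    return math.exp(-m) * m**x / math.factorial(x)

def max_gap(m, N, x_max=15):
    """Largest |binomial - Poisson| over x = 0..x_max, with n = mN and p = 1/N."""
    n = m * N
    return max(abs(binomial_pmf(x, n, 1.0 / N) - poisson_pmf(x, m))
               for x in range(x_max + 1))

# m = 2 is an illustrative assumption; N is the number of unit areas.
gaps = {N: max_gap(2, N) for N in (10, 100, 1000)}
for N, gap in gaps.items():
    print(f"N = {N:5d}  largest difference = {gap:.6f}")
```

The largest difference shrinks roughly in proportion to 1/N, which is the behaviour the derivation predicts.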

It is not difficult to justify the premise that dilution egg counts
follow the Poisson distribution (Peters, 1941). Dilution egg counts are
usually performed using the McMaster slide technique, in which the eggs
counted are those that lie under a centimeter square engraved on the fixed
coverglass, which is supported 0.15 cm. above a slide by 0.15 cc. of egg
suspension. Because the faecal suspension is a half-saturated solution of
salt, the eggs rise to the coverglass, making them easy to count.

Peters (1941) estimates the average volume of 7 common sheep nematode
eggs to be $0.0002\ \mathrm{mm}^3$, where he is considering the average egg
to be a square prism of dimensions $90\mu$ by $45\mu$. Since
$150\ \mathrm{mm}^3$ is the volume of the suspension under the centimeter
square, Peters calculates that there is room for 750,000 eggs closely packed
in the available space. If the mean number of eggs in each square centimeter
is 100, then p, the probability that any unit
Properties

The Poisson distribution is known to approach the normal distribution as
m becomes very large. This can be shown to be true by investigating the
distribution statistics $\alpha_3$ and $\alpha_4$, where

$$\alpha_3 = \frac{\mu_3}{\sigma^3}, \qquad \alpha_4 = \frac{\mu_4}{\sigma^4},$$

where $\mu_3$ and $\mu_4$ are the third and fourth moments about the mean of
the distribution, respectively, and $\sigma^3$ and $\sigma^4$ are the third
and fourth powers of the standard deviation of the distribution. The
statistic $\alpha_3$ is a measure of skewness, and for the normal
distribution is always equal to zero due to the symmetry of the normal curve.
A measure of kurtosis, i.e., a measure of whether the distribution is more
peaked or flat-topped than the normal curve, is given by the statistic
$\alpha_4$ and is always equal to three in the normal distribution. It can be
shown that for any point binomial distribution found by evaluating
$(q + p)^n$, the values for $\alpha_3$ and $\alpha_4$ are

$$\alpha_3 = \frac{q - p}{\sqrt{npq}}, \qquad \alpha_4 = \frac{1}{npq} - \frac{6}{n} + 3.$$

Now if p is assumed to be very, very small (which results in q
approaching 1), and n to be very large so that np is an appreciable
quantity, then $\alpha_3$ and $\alpha_4$ become appropriate statistics for
the Poisson distribution. In such a distribution we have

$$\alpha_3 = \frac{1}{\sqrt{m}}, \qquad \text{and} \qquad \alpha_4 = 3 + \frac{1}{m}.$$

Thus, if m (the arithmetic mean of a Poisson distribution) is large,
$\alpha_3 \to 0$ and $\alpha_4 \to 3$, which implies the Poisson distribution
approaches the normal distribution (Waugh, 1943).
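The two shape statistics can be verified directly from the Poisson probabilities. The sketch below is illustrative only: the function name `poisson_shape` and the truncation rule for the infinite sum are assumptions made here, and the chosen values of m are arbitrary.

```python
import math

def poisson_shape(m):
    """Return (alpha_3, alpha_4) for a Poisson distribution with mean m,
    computed from moments about the mean of its probabilities."""
    # Truncate the infinite sum well past the mean (assumed cut-off rule).
    upper = int(m + 12.0 * math.sqrt(m) + 12.0)
    probs, term = [], math.exp(-m)
    for x in range(upper + 1):
        probs.append(term)
        term *= m / (x + 1)
    mean = sum(x * p for x, p in enumerate(probs))
    mu2 = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    mu3 = sum((x - mean) ** 3 * p for x, p in enumerate(probs))
    mu4 = sum((x - mean) ** 4 * p for x, p in enumerate(probs))
    sigma = math.sqrt(mu2)
    return mu3 / sigma**3, mu4 / sigma**4

for m in (1, 4, 25, 100):
    a3, a4 = poisson_shape(m)
    print(f"m = {m:3d}  alpha_3 = {a3:.4f}  alpha_4 = {a4:.4f}")
```

As m increases, the printed values move toward 0 and 3, the values for the normal curve.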

Equality of the Mean and Variance. One of the most useful properties
of the Poisson distribution is the equality of the mean and variance, which
can be shown in the following manner:

The moment generating function of a random variable X with a Poisson
distribution is $\phi(\theta)$ where

$$\phi(\theta) = E(e^{\theta X}) = \sum_{x=0}^{\infty} e^{\theta x}\,\frac{e^{-m} m^{x}}{x!} = e^{m(e^{\theta} - 1)}.$$

Differentiating with respect to $\theta$ and setting $\theta = 0$ gives
$\phi'(0) = E(X) = m$ and $\phi''(0) = E(X^2) = m^2 + m$. It is concluded
that

$$E(X) = \mu = m$$

and

$$\sigma^2 = E(X^2) - (E(X))^2 = m^2 + m - m^2 = m.$$

The Poisson distribution is therefore completely specified by the one
parameter m.
Transformations. To make the variance independent of the mean (where the
observations are Poisson distributed), Bartlett (1947) recommends the square
root transformation, $\sqrt{x}$, or for very small numbers the transformation
$\sqrt{x + 1/2}$.

Emik (1947) used the square root transformation on egg counts obtained on
each of twelve sheep where the counts were Poisson distributed. For each
sheep he calculated the mean of the transformed counts ... the mean and
variance were no longer significantly correlated, ... performed an analysis
of variance on these transformed counts ... tests for significance. He was
able to make these tests only ... transformation was made, causing the means
to become independent of the variances and the counts somewhat more normal.
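The effect of the square root transformation can be illustrated by computing the variance of x and of √x exactly from the Poisson probabilities. This is a sketch under stated assumptions: the means 5, 20, and 100 are arbitrary illustrative values, and the helper names are hypothetical.

```python
import math

def poisson_probs(m):
    """Poisson probabilities for x = 0, 1, ..., truncated far into the upper tail."""
    upper = int(m + 12.0 * math.sqrt(m) + 12.0)
    probs, term = [], math.exp(-m)
    for x in range(upper + 1):
        probs.append(term)
        term *= m / (x + 1)
    return probs

def variance_of(transform, m):
    """Variance of transform(X) when X is Poisson with mean m."""
    probs = poisson_probs(m)
    mean = sum(transform(x) * p for x, p in enumerate(probs))
    return sum((transform(x) - mean) ** 2 * p for x, p in enumerate(probs))

for m in (5, 20, 100):
    raw = variance_of(float, m)          # grows with the mean
    root = variance_of(math.sqrt, m)     # stays near one quarter
    print(f"m = {m:3d}  var(x) = {raw:8.3f}  var(sqrt x) = {root:.4f}")
```

The variance of the raw counts follows the mean, while the variance of the transformed counts settles close to 0.25 regardless of m, which is why the transformed counts suit an analysis of variance.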

Testing  Hypotheses 

Given that an experimenter has taken faecal samples and made numerous
dilution counts from these samples where the number of eggs per count ranged
from zero to n, he is usually interested in testing some hypothesis about
these counts. A common hypothesis tested is that the counts follow the
Poisson distribution. If the experimenter assumes this hypothesis to be true
he can find the expected number of counts with 0,1,2,...,n eggs per count by
evaluating the probability function of the Poisson distribution for
x = 0,1,2,...,n and for m calculated using his sample counts. The
probability function can be evaluated by first calculating equation (8) for
x = 0, then using the relationship

$$\frac{e^{-m} m^{x+1}}{(x+1)!} = \frac{e^{-m} m^{x}}{x!} \cdot \frac{m}{x+1}$$

to obtain the values for x = 1,2,...,n. The probability function for each
value of x is then multiplied by N, the total number of counts taken, to
obtain the expected frequency of counts for x = 0,1,2,...,n. Now the
experimenter is in need of some statistical test that will help him make a
decision as to whether his observed counts differ significantly from the
calculated expected counts, resulting in a decision to either accept or
reject the hypothesis that the counts are distributed according to the
Poisson distribution.
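The recurrence just described translates directly into a short loop. The following sketch assumes a small set of hypothetical counts (invented for illustration, not data from any of the studies cited) and a hypothetical function name.

```python
import math

def expected_frequencies(counts):
    """Expected number of counts with 0, 1, ..., max(counts) eggs under a
    Poisson distribution whose mean m is estimated from the sample."""
    total = len(counts)                 # N, the total number of counts taken
    m = sum(counts) / total             # sample mean number of eggs per count
    prob = math.exp(-m)                 # probability function at x = 0
    expected = [total * prob]
    for x in range(max(counts)):
        prob *= m / (x + 1)             # P(x + 1) = P(x) * m / (x + 1)
        expected.append(total * prob)
    return m, expected

counts = [0, 1, 1, 2, 0, 3, 1, 0, 2, 1]   # hypothetical egg counts
m, expected = expected_frequencies(counts)
for x, e in enumerate(expected):
    print(f"x = {x}: expected frequency = {e:.3f}")
```

Using the recurrence avoids recomputing factorials and powers for every value of x.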

Probably the most commonly used test is the chi-square test developed
by Karl Pearson in 1899. He developed the $\chi^2$ statistic, where

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i},$$    (9)

    $O_i$ = the number of egg counts observed in the $i$th class, where
            each class is one unit in length,
and
    $E_i$ = the number of egg counts expected in the $i$th class under the
            hypothesis of Poisson distributed counts.

Under these conditions $\chi^2$ has approximately the chi-square distribution
with n - p - 1 degrees of freedom, where the number of class intervals used
in fitting the distribution is n, and p is the number of parameters in the
distribution. Since the Poisson distribution is being fitted, the appropriate
degrees of freedom are (n - 2). Equation (9) can thus be used to test the
hypothesis that the observed egg counts follow the Poisson distribution
(Ostle, 1963).
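A minimal sketch of the calculation in equation (9) follows. The counts are hypothetical, and one detail is an assumption made here rather than something stated in the report: the highest class is treated as "that many eggs or more", so that the expected frequencies add up to the number of counts. In practice, classes with very small expected frequencies would normally be pooled before the test is applied.

```python
import math
from collections import Counter

def poisson_goodness_of_fit(counts):
    """Chi-square statistic of equation (9) for a Poisson fit.
    Classes are 0, 1, ..., top-1 eggs and a final class 'top or more'."""
    total = len(counts)
    m = sum(counts) / total
    top = max(counts)
    observed = Counter(counts)
    chi2, cumulative, prob = 0.0, 0.0, math.exp(-m)
    for x in range(top + 1):
        # The last class absorbs the remaining upper tail (assumption).
        p_class = prob if x < top else 1.0 - cumulative
        expected = total * p_class
        chi2 += (observed[x] - expected) ** 2 / expected
        cumulative += prob
        prob *= m / (x + 1)
    n_classes = top + 1
    return chi2, n_classes - 2          # n - p - 1 with p = 1

counts = [0, 1, 1, 2, 0, 3, 1, 0, 2, 1]    # hypothetical egg counts
chi2, df = poisson_goodness_of_fit(counts)
print(f"chi-square = {chi2:.4f} with {df} degrees of freedom")
```

The statistic would then be referred to a chi-square table with the returned degrees of freedom.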

Instead of calculating the expected number of counts, it is a common
procedure to use the fact that for all Poisson series, the variance is
numerically equal to the mean and to use a slightly different form of the
chi-square statistic than given in equation (9). Since

$$\sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{\sigma^2}$$

has the chi-square distribution with (n - 1) degrees of freedom (Fisher,
1954), for true samples from a Poisson distribution,

$$\chi^2 = \sum_{i=1}^{n} \frac{(x_i - m)^2}{m}$$    (10)

is approximately distributed as a chi-square with (n - 1) degrees of freedom,
where

    $x_i$ = the number of eggs observed in the $i$th count,
    n = the total number of counts taken,
and
    m = the mean number of eggs per count.

The reader should note that equation (9) is used when working with frequencies
of egg counts, whereas equation (10) is applicable when working with the
number of eggs per count. When using equation (10) the hypothesis being
tested is identical to the hypothesis underlying equation (9).
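The variance-to-mean form of the test is a one-line computation. As a sketch, it is applied here to the four counts 3, 0, 5, and 8 that appear in a worked example later in this section; they are borrowed purely for illustration, and the function name is hypothetical.

```python
def dispersion_chi_square(counts):
    """Sum of squared deviations from the sample mean, divided by the mean,
    with n - 1 degrees of freedom (n = number of counts)."""
    n = len(counts)
    m = sum(counts) / n
    chi2 = sum((x - m) ** 2 for x in counts) / m
    return chi2, n - 1

chi2, df = dispersion_chi_square([3, 0, 5, 8])
print(f"chi-square = {chi2} with {df} degrees of freedom")
```

Here the mean is 4, the squared deviations sum to 34, and the statistic is 34/4 = 8.5 with 3 degrees of freedom.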

It is also of importance to note that the sum of k independent
chi-squares is a chi-square with degrees of freedom equal to the sum of the
degrees of freedom of each individual $\chi^2$. The resulting $\chi^2$ test
is sensitive and will often show discrepancies that were not apparent in the
separate $\chi^2$ values.

If many values of $\chi^2$ are available for testing (all with the same
degrees of freedom), it is often advisable to distribute the various $\chi^2$
values into classes bounded by values given in the chi-square table
(depending, of course, on the degrees of freedom with which the table was
entered) as Brambell (1963) does with chi-squares calculated from egg counts.
The expected frequency of these classes can be obtained directly from the
chi-square table. Thus a $\chi^2$ test may be performed using equation (9)
where

    $O_i$ = the observed frequency of occurrence of $\chi^2$ values in the
            $i$th class interval,
and
    $E_i$ = the expected frequency of occurrence of $\chi^2$ values in the
            $i$th class as taken from the chi-square table.
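The class-based procedure can be sketched as follows. The class boundaries are the usual tabulated chi-square values for 3 degrees of freedom at P = 0.90, 0.70, 0.50, 0.30, and 0.10; the choice of these particular levels, the list of sample chi-square values, and the function name are all assumptions made for illustration, not details given by Brambell.

```python
from bisect import bisect_right

# (upper-tail probability P, tabulated chi-square value) for 3 degrees of freedom.
TABLE_DF3 = [(0.90, 0.584), (0.70, 1.424), (0.50, 2.366), (0.30, 3.665), (0.10, 6.251)]

def class_test(chi2_values, table=TABLE_DF3):
    """Sort chi-square values into classes bounded by table entries and
    compare observed with expected class frequencies using equation (9)."""
    bounds = [value for _, value in table]
    levels = [1.0] + [p for p, _ in table] + [0.0]
    # Expected proportion in each class is the difference of adjacent P levels.
    proportions = [levels[i] - levels[i + 1] for i in range(len(levels) - 1)]
    observed = [0] * len(proportions)
    for value in chi2_values:
        observed[bisect_right(bounds, value)] += 1
    expected = [len(chi2_values) * p for p in proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return observed, expected, statistic

# Hypothetical chi-square values, each with 3 degrees of freedom.
values = [0.2, 0.9, 1.0, 1.8, 2.0, 2.9, 3.0, 4.5, 5.0, 7.0]
observed, expected, statistic = class_test(values)
print(observed, [round(e, 2) for e in expected], round(statistic, 4))
```

Because the class boundaries come from the table, no probabilities need to be computed: the expected proportions are read off as differences between the tabulated P levels.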
P = the probability that a $\chi^2$ value will be observed that is larger
than the calculated $\chi^2$ value when the hypothesis of Poisson
distributed counts is true.
To every value of $\chi^2$ there thus corresponds some value of P. The
question now arises: What values of P would indicate the hypothesis should be
accepted? Fisher (1954) states that a value of P between 0.1 and 0.9 would
certainly indicate that the hypothesis should not be rejected.

If an experimenter desires to test

    $H_1: m \le m_0$
    $H_2: m > m_0$

for a particular series of counts, this may be accomplished by using a table
of the cumulative Poisson distribution. In this table are values of F(x) where

$$F(x_0) = P(x \le x_0) = \sum_{x=0}^{x_0} \frac{e^{-m} m^{x}}{x!}$$    (11)

for various values of m and x. The procedure for a sample of size one is
to calculate

$$P = 1 - F(x - 1)$$    (12)

where F(x) is read from the above mentioned table assuming $m = m_0$. If
$P \le \alpha$, where $\alpha$ is a preassigned level of significance, $H_1$
is rejected and $H_2$ accepted (Ostle, 1963). For samples of size two or
larger, the sample sum z replaces x in equation (12), where z has the Poisson
distribution due to the following theorem from Wilks (1962), page 206:

    If $(x_1, x_2, \ldots, x_n)$ is a sample from the Poisson distribution
    Po(m), then the sampling distribution of the sample sum, say z, is
    distributed as a Po(nm).

The quantity P = 1 - F(z - 1) is then determined from the cumulative Poisson
tables and $H_1$ is accepted or rejected on the basis of the size of P in
relation to $\alpha$.

As an example, suppose 4 McMaster slide counts have been made, resulting
in the following egg counts: 3, 0, 5, and 8. Let the null and alternative
hypotheses be

    $H_1: m \le 2$
    $H_2: m > 2$,

which is testing the hypothesis that the average number of eggs per count
is 2, and let $\alpha$ be 0.01. Since n = 4 and $m_0 = 2$, from the above
theorem it is seen that z is distributed as Po(8). Using the cumulative
Poisson tables in Ostle (1963) and equation (12) with z in place of x, it is
seen that

    P = 1 - F(16 - 1) = 1 - F(15) = 1 - 0.992
      = 0.008

Since P < 0.01, $H_1$ is rejected and it is concluded that m > 2.
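The tabled value can be reproduced by summing the Poisson probabilities directly. This sketch uses the four counts and the hypothesized mean from the example above; the function names are hypothetical.

```python
import math

def poisson_cdf(k, m):
    """F(k) = P(X <= k) for a Poisson variable with mean m."""
    if k < 0:
        return 0.0
    term = math.exp(-m)
    total = term
    for x in range(1, k + 1):
        term *= m / x
        total += term
    return total

def upper_tail_p(counts, m0):
    """P = 1 - F(z - 1), where z is the sample sum and z ~ Po(n * m0)."""
    z = sum(counts)
    return 1.0 - poisson_cdf(z - 1, len(counts) * m0)

p = upper_tail_p([3, 0, 5, 8], 2)
print(f"P = {p:.4f}")
```

The computed probability is about 0.0082, in agreement with the rounded table value of 0.008, so the conclusion at the 0.01 level is unchanged.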

Often an experimenter wishes to test for significant differences between
means of two or more Poisson series. Suppose 4 series of counts are known
...

    $\chi^2 = \ldots$    (13)

where d is the number of distributions ... of $\chi^2$ ... it is concluded
that the means, $m_i$, ... significantly ... distribution cannot adequately
fit all 4 series of counts (Snedecor, page 234, 1956; Ostle, page 125, 1963).

If the number of counts taken is large so that the total number of eggs
counted is large, it is appropriate to test for differences between 2 Poisson
distributions by means of "Student's" t test, where

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}}$$    (14)

$$\phantom{t} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_{\bar{x}_1}^2 + s_{\bar{x}_2}^2}}$$    (15)

Under the hypothesis that $\mu_1 = \mu_2$, (15) reduces to

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_{\bar{x}_1}^2 + s_{\bar{x}_2}^2}}$$    (16)
As an example, suppose 100 counts were taken resulting in 300 eggs being
counted over all counts. If the entire sample of 100 counts is considered
as a unit with 300 eggs, it may be thought of as a single sample drawn from
a population whose mean is estimated at 300 and whose variance is therefore
300 (assuming the counts are Poisson distributed). Therefore, since Poisson
populations with such large means are approximately normal in distribution,
this sample may be considered as drawn from a normal population whose mean
is estimated by $\bar{x}_1 = 300$ with standard error
$s_{\bar{x}_1} = \sqrt{300}$. If another series of counts is taken, say 75
counts resulting in 275 eggs, the difference between these two distributions
may be tested by equation (16). That is,

$$t = \frac{300 - 275}{\sqrt{300 + 275}} = \frac{25}{24} \approx 1.042,$$

which would indicate that the two series of counts probably are from the
same population, i.e., the same Poisson distribution (Snedecor, p. 437, 1956).
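The arithmetic of this example is easily reproduced. In the sketch below the function name is hypothetical, and each total is treated, as in the text, as a single Poisson observation whose variance equals its value.

```python
import math

def poisson_total_t(total_1, total_2):
    """Equation (16) with each total taken as its own variance estimate."""
    return (total_1 - total_2) / math.sqrt(total_1 + total_2)

t = poisson_total_t(300, 275)
print(f"t = {t:.3f}")
```

The unrounded denominator is the square root of 575, about 23.98, which the text rounds to 24; the result is about 1.04 either way.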

Egg  Count  Studies 

As  was  mentioned  in  the  introduction,  Peters  and  Leiper  (1940)  did  a 
study  using  sheep  in  which  they  investigated  the  variation  among  successive 
dilution  egg  counts  from  the  same  suspension  of  eggs.   They  proposed  two 
questions: 

1)  What  is  the  form  of  such  a  distribution? 

2)  What  is  the  relationship  between  the  mean  and  variance  of  this 
distribution? 


... a McMaster slide. Eleven series of counts were made, where a count is
understood to be all of the eggs present on a slide containing 0.15 ml. of
suspension. Each series consists of 25 counts. All counts in a series were
made from the same volume of 200 ml. of a suspension; the concentration of
eggs was different from series to series.
Since each series had its own mean and variance, and since the authors
were interested in the overall distribution of the 275 counts, the
transformation

$$y = \frac{x - \bar{x}}{s}$$

was applied to each series, where $\bar{x}$ and s varied according to the
series of counts in which the x variate appeared. The mean and standard
deviation of the new compounded distribution of 275 transformed counts were
... and 0.9894, respectively. To show that the transformed counts were in
approximate agreement with the normal curve, first it was shown that the
distribution was roughly normal in shape. This was done by simply
superimposing a normal curve over the histogram made from the transformed
counts grouped into 10 intervals of length y = 0.50 and observing the visual
goodness of fit. Secondly, the chi-square test was performed, i.e., the
observed frequencies in each interval were compared with the frequencies
expected under the hypothesis that the observed values were normally
distributed. In the calculation of the $\chi^2$ statistic (using equation
(9)), 8 intervals were used instead of 10 due to the interval on each end of
the distribution having an expected frequency that was too small. If 10
intervals had been used, this would have resulted in an artificially large
chi-square value. Thus the most extreme intervals were combined with their
nearest neighbor. The total $\chi^2$ was equal to 2.3762 with 7 degrees of
freedom, corresponding to a P of about 0.9. Therefore Peters and Leiper
(1940) concluded that the egg counts were close to the normal distribution
in form.
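The pooling of several series after standardizing each one separately, as in the transformation y = (x - mean)/s described above, can be sketched as follows. The three short series are hypothetical, and the use of the n - 1 divisor for s is an assumption; the report does not say which divisor Peters and Leiper used.

```python
import math
import statistics

def standardize_series(series):
    """Apply y = (x - mean) / s within one series (s uses the n - 1 divisor)."""
    mean = statistics.mean(series)
    sd = statistics.stdev(series)
    return [(x - mean) / sd for x in series]

# Three hypothetical series with very different egg concentrations.
series_list = [[12, 15, 9, 11, 13], [40, 35, 44, 38, 43], [3, 6, 2, 5, 4]]
pooled = [y for series in series_list for y in standardize_series(series)]
pooled_mean = statistics.mean(pooled)
pooled_sd = statistics.stdev(pooled)
print(f"pooled mean = {pooled_mean:.4f}, pooled s.d. = {pooled_sd:.4f}")
```

After the transformation every series has mean zero and unit standard deviation, so the pooled values share a common scale; the pooled standard deviation falls slightly below one because each series has given up one degree of freedom to its own mean.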

The authors point out that the mean and variance of each of the 11 original
series  of  counts  are  roughly  equal.   They  also  note  that  as  the  mean  of  a 
Poisson  series  becomes  large,  the  distribution  approaches  the  normal.   This 
result  has  been  shown  to  be  true  previously  in  this  report.   With  these 
thoughts  in  mind,  they  continue  on  in  an  attempt  to  show  that  the  counts 
are  Poisson  in  nature. 

The mean and variance of each of the original 11 series of counts were
plotted with the mean on the abscissa and the variance on the ordinate. A
linear regression analysis was conducted, resulting in a graph showing the
linear regression line along with its 95% confidence bound lines. The
expected line based on the Poisson distribution, for which the mean and
variance are equal, just intercepted the lower limit line, leading Peters and
Leiper to conclude that the series of counts was just barely consistent with
the Poisson distribution.

These authors also performed a regression analysis on the logarithms of
the counts due to the fact that low counts were anticipated. Plotting the
log standard deviation against the log mean and then calculating the regression
line with its 95% confidence limits, the expected Poisson-theory line was out
...

Hunter and Quenouille (1952) were interested in determining ... to count
for each ... (... per slide) when making counts of naturally ... They were
concerned with this problem because ... the count increases as more eggs are
counted when dilution counting ... used. A related problem of interest
concerned the optimum time ... before repeating the whole sampling procedure
again. In order to investigate these two problems they first investigated the
distribution of egg counts both between and within sheep. They found these to
be the negative binomial distribution and the Poisson distribution,
respectively. The portion of their paper dealing with the negative binomial
distribution will be delayed for discussion until later.

The sheep used were ewes and gimmers (yearling sheep) from various flocks
in various parts of Scotland. Faecal samples were taken from the rectum of
these sheep and were counted using the McMaster slide. The authors made 9
series of counts in all, where each series consisted of obtaining several
counts on each sheep.

The first series consisted of 4 counts on each of 132 sheep. The
chi-square statistic was calculated using equation (10) on each sheep's
counts and these values were added together yielding a total chi-square of
415.S with 3x132 = 396 degrees of freedom. P was greater than 0.2, resulting
in the conclusion that chance variation could result in this outcome when the
Poisson distribution was assumed to be applicable. The other 8 series of
counts yielded various chi-square values with their appropriate degrees of
freedom. The total chi-square over all 9 series was 1,322.0 with 1,339 degrees
of freedom, indicating good agreement with the expected value (P > 0.50) and
very good agreement with the expected values in each series of counts.
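With several hundred degrees of freedom, printed chi-square tables are of little use, but the quoted P values can be checked with a normal approximation. The sketch below uses the Wilson-Hilferty cube-root approximation, which is a technique substituted here and not necessarily the one used by Hunter and Quenouille. The first total is entered as 415.5 only because its last digit is not legible in the source; that value is an assumption.

```python
import math

def chi_square_upper_tail(chi2, df):
    """Approximate P(chi-square with df degrees of freedom exceeds chi2)
    using the Wilson-Hilferty cube-root normal approximation."""
    c = 2.0 / (9.0 * df)
    z = ((chi2 / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    return 0.5 * math.erfc(z / math.sqrt(2.0))    # upper tail of the normal curve

p_first = chi_square_upper_tail(415.5, 396)       # 415.5 is an assumed reading
p_total = chi_square_upper_tail(1322.0, 1339)
print(f"first series: P = {p_first:.2f};  all 9 series: P = {p_total:.2f}")
```

Both approximate probabilities are consistent with the statements in the text: the first exceeds 0.2 and the second exceeds 0.50.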

Brambell (1963) considers the extent of variation of egg counts taken
from the same sample, and taken from several samples drawn from the same
sheep whose host-parasite physiological relationship is unchanged (i.e., given
a stable host-parasite relationship, the effects of the time of day, amount of
faeces passed, and water content of the faeces can be investigated with less
error). In order to gain insight into these two problems, Brambell makes
observations of egg counts obtained using the McMaster slide technique on
housed sheep, kept under experimental conditions, that are infected with
Haemonchus contortus.

He used two groups of sheep. Group I consisted of seven sheep infected
with Haemonchus contortus, and group II contained 4 sheep reared indoors under
worm-free conditions. Sixteen faecal samples were collected from each sheep
in group I, where each sample consisted of four chamber counts. Equation (10)
was used to compute a $\chi^2$ value for each sample, each with three degrees
of freedom. A chi-square goodness-of-fit test was then performed using a
slightly different technique than previous workers cited in this report had
used.

Brambell's technique involves comparing the per cent of sample
chi-square values falling within a given range of $\chi^2$ values in the
chi-square table against the expected frequency of $\chi^2$ values falling
within these intervals. The range of tabular $\chi^2$ values has a known
expected frequency. ..., for each range of values of $\chi^2$, the observed
... of occurrence ... $\chi^2$ statistic for each interval. A pooled $\chi^2$
value ... $\chi^2$ statistic computed for each interval ... total chi-square
value of 5.63 with 6 degrees of freedom (since he used 6 intervals). P was
close to 0.30, which supported the hypothesis of Poisson distributed counts.
Accordingly, Brambell concludes that the hypothesis of Peters (1941) that
McMaster slide egg counts follow the Poisson distribution is confirmed.

Brambell (1963) considers his technique as outlined above to be superior
to methods used that compare only one estimate of variance in a series of
counts with the mean of the series. He bases his argument on the fact that
other  distributions  than  the  Poisson  have  the  characteristic  that  at  certain 
values  of  the  mean,  the  variance  approximates  to  the  mean.   He  reasons  that 
to  compare  only  one  estimate  of  variance  in  a  series  of  counts  with  the  mean 
of  the  series  is  thus  not  sufficient  to  distinguish  the  Poisson  series. 

Of the four sheep in group II (labeled A, B, C, and D), sheep A, B, and
C had 120 samples counted (2 chamber counts per sample), and sheep D had
108 two-chamber counts. The chi-square statistic was calculated on each
sheep using the same procedure as was used on group I sheep. Of the four
series, two (A and C) deviated from the Poisson distribution with chi-square
values of 23.36 and 27.02 respectively, with 6 degrees of freedom. In the
series from sheep A, 13% of the counts showed abnormally high variability.
With sheep C, the mean was so low and the number of chambers counted so few
that the chi-square test became unreliable.
Many years earlier Peters (1941) hypothesized that a discrepancy such
as  occurred  in  sheep  A  could  be  explained  by  personal  error  on  the  part  of 
the  counter  if  the  mean  count  is  in  the  neighborhood  of  50.   The  easiest  way 
to  overcome  the  difficulties  in  the  series  associated  with  sheep  C  is  to 
count  more  chambers. 

Brambell  (1963)  gives  a  table  derived  from  the  Poisson  distribution  as 
an  aid  in  the  determination  of  the  range  of  populations  from  which  an  egg 
count  could  have  been  drawn.   For  counts  from  0  to  170  the  table  gives  the 
range  of  means  of  populations  from  which  the  given  counts  could  be  drawn 
more  often  than  once  in  twenty  times,  assuming  the  counts  are  from  a  Poisson 
distribution. 
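
Entries of such a table can be approximated numerically. The sketch below assumes that "more often than once in twenty times" is applied as an exact two-sided interval with 2.5% in each tail; the source does not say how Brambell divided the 5%, so this split, like the function names, is an assumption.

```python
import math

def poisson_cdf(x, mu):
    """P(X <= x) for a Poisson variate with mean mu."""
    if x < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, x + 1):
        term *= mu / i
        total += term
    return total

def bisect(func, lo, hi, tol=1e-9):
    """Root of func on [lo, hi], assuming func changes sign there."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def poisson_mean_range(count, tail=0.025):
    """Range of Poisson means compatible with an observed count."""
    if count == 0:
        lower = 0.0
    else:
        # smallest mean for which P(X >= count) reaches the tail probability
        lower = bisect(lambda mu: 1 - poisson_cdf(count - 1, mu) - tail,
                       1e-12, count)
    # largest mean for which P(X <= count) still reaches the tail probability
    upper = bisect(lambda mu: poisson_cdf(count, mu) - tail,
                   count, 2 * count + 20)
    return lower, upper
```

Under this assumption a count of 10 gives a range of roughly 4.8 to 18.4.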

Emik (1947) performed a chi-square test on 240 egg counts from 12
heterogeneous sheep in a manner similar to Brambell, but instead of using
a range of chi-square values for each interval, he used probability limits
in the χ² tables with one degree of freedom under the assumption of Poisson
distributed counts. He then counted the number of observed χ² values that
fell in each interval of P. The expected number of χ² values falling
in each P interval was calculated directly from the probability intervals
themselves. The χ² calculated was equal to 6.250 with 7 degrees of
freedom, yielding a P of about 0.5. A P of this size led to acceptance of
the hypothesis that the counts are Poisson distributed.
Derivation

The negative binomial distribution is one of several distributions
proposed to describe a series of counts in which the variance is
significantly larger than the mean. The negative binomial,

$$(q - p)^{-k},$$

where $p = m/k$ and $q = 1 + p$ (m and k are defined below), is so
called because of its analogy to the positive binomial $(q + p)^n$.

The first known derivation and publication of the distribution was due
to Montmort in 1714. Pascal and Fermat are also recognized as having
discussed the distribution (Bartko, 1961). Student (1907) obtained the
negative binomial when he observed, while deriving the Poisson series from the
binomial, that two of his series gave negative values for p and n yet
fitted the data very well.

The distribution is completely specified by two parameters. The first
is the mean, namely m. The second is an index of over-dispersion, denoted
by k. The nature of k can be better understood if it is recalled that in
the Poisson distribution the mean is equal to the variance. In the negative
binomial distribution, however, the variance is given by $m + m^2/k$. Note that
as k becomes large, the second term in the variance equation tends to zero,
i.e., the variance will approach the mean in value. Thus the negative
binomial distribution with parameters m and k becomes very much like the
Poisson with parameter m. As k becomes very small, the variance becomes
very large, a property called "over-dispersion" (Northam and Rocha, 1958).
Wilks (1943) formally proved that the negative binomial approaches the
Poisson as $k \to \infty$. In addition, it is pointed out by Bliss and Fisher (1953)
that as $k \to 0$ and the number of units containing no individuals is disregarded,
the distribution approaches Fisher's logarithmic series (Williams, 1947).

If $(q - p)^{-k}$ is expanded, the probability $P_x$ that an observational unit
will contain x = 0, 1, 2, ... individuals is

$$P_x = \frac{(k + x - 1)!}{x!\,(k - 1)!}\,\frac{R^x}{q^k}$$

where R = p/q = m/(k + m). Therefore

$$P_x = \frac{(k + x - 1)!}{x!\,(k - 1)!}\left(\frac{m}{k + m}\right)^x q^{-k}
      = \frac{(k + x - 1)!}{x!\,(k - 1)!}\,p^x q^{-k-x} \tag{17}$$

where k need not be an integer. To find the expected frequency of units
with x individuals simply multiply equation (17) by N, the total number of
counts.
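
Equation (17) can be evaluated for non-integer k by writing the factorials as gamma functions, since (k + x - 1)! = Γ(k + x) and (k - 1)! = Γ(k). The following is a minimal sketch under that substitution; the function names are hypothetical and m is assumed to be positive.

```python
import math

def neg_binomial_pmf(x, m, k):
    """Equation (17): probability of a count x, for mean m > 0 and index k > 0."""
    p = m / k
    q = 1 + p
    # log of (k + x - 1)! / (x! (k - 1)!) using log-gamma, valid for any k > 0
    log_coeff = math.lgamma(k + x) - math.lgamma(x + 1) - math.lgamma(k)
    return math.exp(log_coeff + x * math.log(p) - (k + x) * math.log(q))

def expected_frequencies(m, k, n_counts, x_max):
    """Expected number of units with x = 0..x_max individuals among N counts."""
    return [n_counts * neg_binomial_pmf(x, m, k) for x in range(x_max + 1)]
```

Working on the log scale avoids overflow in the factorial ratio when x is large.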

If 1/q is taken to be the probability of a "success" and p/q as the
probability of a "failure" in a trial, then equation (17) can also be
interpreted to be the probability that x + k trials will be required to obtain
k successes (Wilks, 1943).

Equation (17) is not the only form of the negative binomial in use.
Bartko (1961) lists the following two forms in addition to the form given
in equation (17):

$$p(x) = \binom{x + r - 1}{x} p^r q^x, \qquad x = 0, 1, 2, \ldots \tag{18}$$

where r and p are parameters, p + q = 1, and r is an integer;
and

$$p(x) = \binom{n - 1}{r - 1} p^r q^x, \qquad x = 0, 1, 2, \ldots \tag{19}$$

where r and p are parameters with n = r + x. Wilks (1962) gives still
another form:

$$p(x) = \binom{x - 1}{k - 1} p^k q^{x-k}, \qquad x = k, k+1, \ldots,$$

where x is a random variable denoting the number of trials performed in
order to obtain exactly k "successes". Although these alternative forms
have many useful applications, the negative binomial in the form of equation
(17) will be used in this report due to its frequent use in egg count studies.

Wilks  (1943)  shows  that  the  mean  and  variance  of  the  negative  binomial 
are  kp  and  kpq  respectively,  which  Fisher  (1941)  points  out  are  identical 
with  the  first  and  second  moments  of  the  positive  binomial  except  that  k 
corresponds  to  -n  and  q  =  1  +  p,  i.e.,  the  sign  of  p  is  changed.  Wilks 
(1943)  also  formally  proves  that  the  negative  binomial  is  an  extension  of 
the  Poisson  series  as  the  variance  of  the  negative  binomial  approaches  the 
mean  (as  was  noted  above). 
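
These moments agree with the variance quoted earlier. The short derivation below uses only the definitions p = m/k and q = 1 + p given above.

```latex
kp = k \cdot \frac{m}{k} = m,
\qquad
kpq = m\left(1 + \frac{m}{k}\right) = m + \frac{m^{2}}{k}.
```

For fixed m the term m²/k vanishes as k grows without bound, so the variance tends to m, the Poisson value.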

Models 

The negative binomial distribution can arise from a variety of
biological situations. In fact it has generally been held that in some cases,
one can start from two or more mutually incompatible biological hypotheses
and arrive at the same over-dispersed distribution using deductive reasoning.
As an example of this, Bliss (1958) states that one could assume that unit
areas are unequally exposed to infestation and individuals completely
independent of each other, or that contagion was present and the initial
infestation uniform. With the proper definitions, both hypotheses could lead
to the same over-dispersed distribution.
Anscombe (1950) and Bliss and Fisher (1953) discuss the several ways
in  which  the  negative  binomial  may  arise.   Only  two  of  these  will  be  presented 
here  due  to  their  applicability  to  egg  count  studies.   The  reader  interested 
in  the  other  models  is  referred  to  the  above  authors. 

When  the  mean,  m,  of  a  Poisson  distribution  is  not  constant  from  trial 
to  trial  or  sample  to  sample,  then  the  counts  may  be  a  mixture  of  several 
homogeneous  Poisson  distributions.   In  such  a  mixture  of  Poissons,  the  mean 
represents  a  positive  continuous  variate.   If  m  is  distributed  according 
to  the  Eulerian  distribution  (also  known  as  the  Pearson  Type  III  distribution) 
then the counts conform to the negative binomial distribution. Also, if the
mean degree of infestation in different sampling units follows the
logarithmic distribution the negative binomial is known to arise (Hunter and
Quenouille, 1952).
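
The first model can be illustrated by simulation. The sketch below is illustrative only: it assumes the Eulerian (Pearson Type III) distribution is parameterized as a gamma distribution with shape k and scale m/k, so that its mean is m; the values m = 4 and k = 2 are invented, and the Poisson sampler is a simple textbook method suited to modest means.

```python
import math
import random

def poisson_draw(mean, rng):
    """Knuth's multiplication method; adequate for modest means."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_mixture(m, k, n, seed=1):
    """Draw n counts whose Poisson means follow a gamma law with mean m."""
    rng = random.Random(seed)
    # gammavariate(shape, scale): shape k and scale m/k give mean m
    return [poisson_draw(rng.gammavariate(k, m / k), rng) for _ in range(n)]

counts = simulate_mixture(m=4.0, k=2.0, n=20000)
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
# the sample variance should lie near m + m*m/k = 12, well above the mean
print(round(mean, 2), round(var, 2))
```

A pure Poisson series with the same mean would show a variance near 4; the excess is the over-dispersion contributed by the varying mean.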

Estimation  of  Parameters 

The parameters m and k are estimated from the frequency
distribution of a sample by the statistics $\hat{m}$ and $\hat{k}$. The mean number of eggs
per count is estimated efficiently from the frequency of counts, f, at each
level of x, the number of eggs per count. That is,

$$\bar{x} = \hat{m} = \frac{1}{N}\sum_{x=0}^{n} x f_x, \qquad x = 0, 1, 2, \ldots, n \tag{20}$$

where

N = total number of counts taken

and

n = the largest number of eggs observed for any count.


The statistic $\hat{k}$ may be computed in several ways. Three of the methods
commonly used are the method of moments, the ratio of the total number of
counts to the number of counts with no eggs present, and the method of
maximum likelihood.

The Method of Moments. The simplest and oldest method of estimating k
is based on the first and second moments, m and $s^2$ respectively. Since

$$s^2 = kpq = m + \frac{m^2}{k}, \tag{21}$$

if equation (21) is solved for k we have the result:

$$\hat{k}_1 = \frac{m^2}{s^2 - m} \tag{22}$$

where

$$s^2 = \frac{1}{N(N-1)}\left[N\sum_{x=0}^{n} x^2 f_x - \Big(\sum_{x=0}^{n} x f_x\Big)^2\right]. \tag{23}$$


In practice, $\hat{m}$ is calculated using equation (20) and replaces m in
equation (22) to give $\hat{k}_1$, which is the estimate of k derived by the method
of moments, i.e., in this instance, by equating the variance of the sample
to the variance of the distribution (Anscombe, 1949).
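
The moment calculation can be sketched as follows. The frequencies in the usage note are invented for illustration, and the formulas are taken to be the usual forms of equations (20), (22), and (23): the sample mean, k = m²/(s² - m), and the unbiased sample variance computed from a frequency table.

```python
def moment_estimates(freq):
    """freq[x] = number of counts with x eggs; returns (m, s2, k1)."""
    n_counts = sum(freq)
    sum_x = sum(x * f for x, f in enumerate(freq))
    sum_x2 = sum(x * x * f for x, f in enumerate(freq))
    m = sum_x / n_counts                                   # equation (20)
    s2 = (n_counts * sum_x2 - sum_x ** 2) / (n_counts * (n_counts - 1))  # (23)
    if s2 <= m:
        # without over-dispersion the moment estimate is negative or infinite
        raise ValueError("s2 <= m: moment estimate of k is undefined")
    k1 = m * m / (s2 - m)                                  # equation (22)
    return m, s2, k1
```

For the invented table freq = [10, 8, 6, 4, 2] the mean is 4/3, the variance is 1400/870, and the moment estimate of k is 58/9, about 6.44.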

Fisher (1941) showed that the estimate of p by the method of moments
is given by the equation

$$\hat{p} = \frac{s^2 - m}{m}. \tag{24}$$

This result is easily seen by replacing k by m/p in equation (22) and solving
for p.
R. A. Fisher (1941) also derived the equation of 1/E, where E is the
efficiency of calculating k by the method of moments compared with k calculated
using the fully efficient maximum likelihood technique. Anscombe (1950)
plots this efficiency for various values of m and k. In general, the
method of moments has an efficiency of 90% or more for small values of m
when k/m > 6, for large values of m when k > 13, and for intermediate values
of m when (k + m)(k + 2)/m ≥ 15 (Bliss and Fisher, 1953). Fisher (1941)
states that if p is less than 1/9 for any value of k, or if k exceeds 18 for
any value of p, then high efficiency is assured. If the efficiency in any
particular situation turns out to be low, then a more exact fit may be
obtained by the maximum likelihood method, which is presented following the
ratio method.

The Ratio Method. To estimate k from the ratio of the total number of
observational units, (N), to the number of units with no eggs present, ($f_0$),
it is necessary to note from equation (17) that the probability for x = 0
is $P_0 = 1/q^k$. Replacing $P_0$ with the proportion of empty units to total units,
it is seen that

$$P_0 = f_0/N = 1/q^k = 1/(1 + m/k)^k.$$

Taking logarithms,

$$\log(f_0/N) = \log 1 - \hat{k}_2 \log(1 + m/\hat{k}_2)$$

or

$$\hat{k}_2 \log(1 + m/\hat{k}_2) = -\log f_0 + \log N$$

or

$$\hat{k}_2 \log(1 + m/\hat{k}_2) = \log(N/f_0). \tag{25}$$

Equation (25) cannot be solved for $\hat{k}_2$ directly, so the estimate is found
by trial. The procedure used is to calculate the left side of equation (25)
for two different values of k whose larger and smaller values (denoted by
$k'_1$ and $k'_2$, respectively) give a product both larger and smaller than the
right side of equation (25), which is calculated from the sample values and is
therefore a constant. The first approximation of k (denoted by $k'_3$) is
obtained by interpolation between these two products. The left side of
equation (25) is then calculated using $k'_3$, and interpolation between $k'_1$ and
$k'_3$ or $k'_2$ and $k'_3$ is executed depending on whether the new product (using $k'_3$)
is larger or smaller than the right side of equation (25). This final
interpolation gives the desired estimate of k.
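
The trial-and-interpolation procedure can be replaced on a computer by a simple root search. The sketch below uses bisection rather than the hand interpolation described above, which is a substitution of technique and not the procedure of the cited authors; the function name and default bracket are hypothetical.

```python
import math

def ratio_method_k(m, n_counts, n_empty, lo=1e-6, hi=1e6, tol=1e-10):
    """Solve k*log(1 + m/k) = log(N/f0), equation (25), by bisection."""
    target = math.log(n_counts / n_empty)

    def excess(k):
        # left side minus right side of equation (25); increases with k
        return k * math.log1p(m / k) - target

    if excess(lo) > 0 or excess(hi) < 0:
        raise ValueError("no root: requires exp(-m) < f0/N < 1")
    while hi - lo > tol * hi:
        mid = math.sqrt(lo * hi)        # geometric midpoint of the bracket
        if excess(mid) < 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

With the invented values m = 3, N = 100, and 16 empty units, chosen so that 1/(1 + 3/2)² = 0.16 exactly, the function returns a value very close to k = 2.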

If  k  is  to  be  estimated  with  an  efficiency  of  90%  or  more,  at  least 
1/3  of  the  observation  units  must  be  empty.   If  m  is  less  than  10,  enough 
empty  units  must  exist  such  that 

$(m + 0.17)(P_0 - 0.32) > 0.20$ (Bliss and Fisher, 1953).

The Method of Maximum Likelihood. It sometimes happens that k cannot
be efficiently estimated by the above techniques. If this is the case, the
method of maximum likelihood may be used, resulting in a fully efficient
estimate of k.

Haldane (1941) derives the following maximum likelihood equation:

$$N[\log(k + m) - \log k] = \frac{f_1 + f_2 + \cdots + f_n}{k} + \frac{f_2 + f_3 + \cdots + f_n}{k + 1} + \cdots + \frac{f_n}{k + n - 1} \tag{26}$$

where
$f_i$ = observed frequency in the i-th class,

n = maximum number of eggs counted in a single count,

N = total number of counts, and

m = mean number of eggs per count.
The maximum likelihood estimate of k is that value of k which satisfies
equation (26) (Edgerton, 1953). Haldane (1941) points out that interpolation
becomes easier if both sides of equation (26) are multiplied by k, since this
causes one side to increase with k while the other side decreases.

Bliss and Fisher (1953) present another maximum likelihood technique
for estimating k but their method is fully efficient only if the largest
frequency does not exceed 30. The equation

$$Z_i = \sum_{x=0}^{n} \frac{A_x}{k'_i + x} - N \ln\left(1 + \frac{m}{k'_i}\right), \tag{27}$$

where $A_x$ is the number of observations exceeding x, is calculated using trial
values of $k'_i$, selected so that they bracket the required estimate $\hat{k}$ such that
$Z_i$ = zero. Equation (27) is computed using $k'_1$, which is usually obtained by
the method of moments. If $Z_1$ is positive, the value of $k'_1$ is increased
slightly, yielding the value $k'_2$ such that $k'_2 > k'_1$. If $Z_1$ is negative, $k'_2$
is taken as less than $k'_1$. Then $k'_2$ is used in the calculation of $Z_2$ and the
new value $k'_3$ is obtained by interpolation between $k'_1$ and $k'_2$ for Z = 0. To
increase the precision of $\hat{k}$, a $Z_4$ may be computed that is opposite in sign
from $Z_3$ by selecting $k'_4$ at about the same distance as $k'_3$ beyond a newly
interpolated k' for Z = 0. Interpolation between $k'_3$ and $k'_4$ gives the final
maximum likelihood estimate of k. Anscombe (1950) presents a good
discussion of the preceding estimation techniques and Bliss and Fisher (1953)
give examples of all three methods.
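
The search for the value of k at which the score Z vanishes can be sketched as below. The score is taken here in the form Z(k) = Σ A_x/(k + x) - N ln(1 + m/k), with A_x the number of observations exceeding x; this form is an assumption about equation (27), and bisection is used in place of the hand interpolation described above. Function names are hypothetical.

```python
import math

def score_z(k, freq):
    """Assumed form of equation (27) for a frequency table freq[x]."""
    n_counts = sum(freq)
    m = sum(x * f for x, f in enumerate(freq)) / n_counts
    total, exceeding = 0.0, n_counts
    for x, f in enumerate(freq):
        exceeding -= f                  # A_x: observations exceeding x
        total += exceeding / (k + x)
    return total - n_counts * math.log1p(m / k)

def ml_k(freq, k_low, k_high, tol=1e-9):
    """Bisection on Z(k) = 0 between trial values that bracket the root.

    Z is positive below the estimate and negative above it, which is why
    a positive Z calls for a larger trial value of k.
    """
    if score_z(k_low, freq) <= 0 or score_z(k_high, freq) >= 0:
        raise ValueError("trial values do not bracket Z = 0")
    while k_high - k_low > tol:
        mid = (k_low + k_high) / 2
        if score_z(mid, freq) > 0:
            k_low = mid
        else:
            k_high = mid
    return (k_low + k_high) / 2
```

A moment estimate is a natural starting point for choosing the two bracketing trial values.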
The Transformation Method. An alternative method (Anscombe, 1950) is based
on a transformation y (depending on a trial value of k) of the counts such that
the variance of y depends only on k. An estimate of k is then obtained by
calculating the sample variance of y and equating it to the expected variance.
The process is then repeated if the new value of k is much different from
the old one. Anscombe (1949) suggests using the transformation

$$y = \log_{10}\left[r + \frac{k}{2}\right] \tag{28}$$

where r is an actual count, if m ≥ 15 and if 2 ≤ k ≤ 5. If k < 2 or k > 5,
then equation (28) may still be used but only if m is sufficiently large.
Under these conditions, the expected variance of y is approximately
independent of m and is equal to $0.1886\,\psi'(k)$, where $\psi'(k)$ is the second
derivative of $\ln\Gamma(k)$ with respect to k.
The equation

$$y = \sinh^{-1}\sqrt{\frac{r + c}{k - 2c}} \tag{29}$$

may be used if k ≥ 2. The constant c is equal to 0.375 if k is large
and 0.2 when k = 2. The expected variance of y using this transformation
is $0.25\,\psi'(k)$. The mean, m, may be as small as 4 or 5 (Anscombe, 1949).

One important characteristic of the negative binomial must be noted.
Since the distribution may be very skewed, confidence limits and standard
errors should be calculated initially in terms of 1/k, which has a relatively
symmetrical distribution.
Variance of $\bar{x}$ and $\hat{k}$

The variance of $\bar{x}$ is

$$V(\bar{x}) = \frac{1}{N}\left[m + \frac{m^2}{k}\right] \tag{30}$$

where m and k are necessarily replaced by $\bar{x}$ and $\hat{k}$.

The variance of $\hat{k}$ depends on the method by which k is estimated. If
equation (22) is used to estimate k, its large sample variance is

$$V(\hat{k}_1) = \frac{2\hat{k}[\hat{k} + 1]}{N R^2} \tag{31}$$

where

$$R = \frac{\bar{x}}{\hat{k} + \bar{x}}. \tag{32}$$

If $\hat{k}$ is calculated using equation (25) then the appropriate large sample
variance is given by

$$V(\hat{k}_2) = \frac{(1 - R)^{-\hat{k}} - 1 - \hat{k}R}{N[-\ln(1 - R) - R]^2} \tag{33}$$

where R is as defined in equation (32). The variance of $\hat{k}$, where $\hat{k}$ is
calculated using the method of maximum likelihood (equation (27)), is given
by the ratio

$$V(\hat{k}) = \frac{k'_3 - k'_4}{Z_4 - Z_3} \tag{34}$$

where $Z_3$ and $Z_4$ are the two values of $Z_i$ just above and below zero and
calculated using $k'_3$ and $k'_4$.
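
The large sample variances can be collected into a few small functions. This is a sketch under the assumption that equations (30) to (34) have their usual forms, V(x̄) = (m + m²/k)/N, V(k₁) = 2k(k + 1)/(NR²) with R = x̄/(k + x̄), V(k₂) = [(1 - R)^(-k) - 1 - kR]/(N[-ln(1 - R) - R]²), and V(k) = (k'₃ - k'₄)/(Z₄ - Z₃); the function names are hypothetical.

```python
import math

def var_mean(m, k, n_counts):
    """Equation (30): variance of the sample mean."""
    return (m + m * m / k) / n_counts

def var_k_moments(m, k, n_counts):
    """Equation (31), with R from equation (32)."""
    r = m / (k + m)
    return 2 * k * (k + 1) / (n_counts * r * r)

def var_k_ratio(m, k, n_counts):
    """Equation (33): variance of k estimated from the empty units."""
    r = m / (k + m)
    numerator = (1 - r) ** (-k) - 1 - k * r
    return numerator / (n_counts * (-math.log(1 - r) - r) ** 2)

def var_k_ml(k3, k4, z3, z4):
    """Equation (34): slope of the score between the two final trials."""
    return (k3 - k4) / (z4 - z3)
```

In practice m and k are replaced by their sample estimates, as stated above.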


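Anscombe (1950) gives the calculation of the large sample variance of $\hat{k}$
when k is estimated by the transformation method as equation (35) [the
equation is illegible in the source], where $\psi'(k)$, $\psi''(k)$, ... denote the
derivatives of $\ln\Gamma(k)$. Equation (35) is accurate for practical purposes
according to Anscombe if m > 50 and assuming the appropriate hyperbolic
sine transformation is used.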

Goodness of Fit Tests

The chi-square test is used to test the adequacy of the negative
binomial distribution in fitting a series of counts in much the same way as the
Poisson distribution is tested for goodness of fit.

Test 1. Bliss and Fisher (1953) presented the following procedure:

1).   Compute the expected frequencies using equation (17). Start with
the number expected at x = 0, which is $\phi_0 = N/q^k$.

2).   Find the expected frequencies for x = 1, 2, 3, ... by using the
relationship

$$\phi_x = \frac{k + x - 1}{x}\,R\,\phi_{x-1}. \tag{36}$$

3).   Avoid accumulating rounding errors by retaining more decimals
in the calculator than necessary.

4).   Pool the frequencies with small expectations so that no expectation
is less than 5.

5).   Compute

$$\chi^2 = \sum_{x=0}^{n} \frac{(f_x - \phi_x)^2}{\phi_x} \tag{37}$$

where $f_x$ is the observed frequency for each x, and the degrees of freedom
are (n - 3), i.e., three less than the number of ratios summed. If χ² is
small and $\hat{m}$ and $\hat{k}$ are efficient estimates, probably no other test is
needed.
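
The five steps can be sketched in Python. The recurrence is taken to be φ_x = ((k + x - 1)/x) R φ_(x-1) with R = m/(k + m) and φ_0 = N/q^k, which follows from equation (17); the pooling rule below works only from the upper tail, which is a simplification, and the function names are hypothetical.

```python
def nb_expected(m, k, n_counts, x_max):
    """Steps 1 and 2: expected frequencies for x = 0..x_max by recurrence."""
    q = 1 + m / k
    r = m / (k + m)
    phi = [n_counts / q ** k]
    for x in range(1, x_max + 1):
        phi.append(phi[-1] * (k + x - 1) / x * r)
    return phi

def pooled_chi_square(observed, expected, minimum=5.0):
    """Steps 4 and 5: pool upper classes, then sum (f - phi)^2 / phi.

    Returns the statistic and the number of classes after pooling; the
    degrees of freedom are that number less three.
    """
    obs, exp = list(observed), list(expected)
    # put the expectation beyond the last listed class into the last class
    exp[-1] += sum(observed) - sum(expected)
    while len(exp) > 1 and exp[-1] < minimum:
        exp[-2] += exp.pop()
        obs[-2] += obs.pop()
    chi2 = sum((f - e) ** 2 / e for f, e in zip(obs, exp))
    return chi2, len(exp)
```

Carrying full floating-point precision through the recurrence plays the role of step 3.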

Other tests have been proposed by Anscombe (1950) and have been
discussed and illustrated by Bliss and Fisher (1953). Two tests in particular
are described that are applicable when N is large, each being based on the
difference between the observed and the expected second and third moments
of the negative binomial. These tests are not influenced by chance
irregularities in the observed frequencies and furthermore, the few large
values in the tails of the observed distribution are not ignored.

Test 2. Using the method of moments (equation (22)), estimate the
first two moments of the sample from a negative binomial distribution and
find T, the difference between the third sample moment and the expected
third sample moment predicted from the first two moments of the same sample.
That is

$$T = \frac{1}{N}\sum_{x=0}^{n} f_x (x - m)^3 - kpq[q + p]
    = \frac{1}{N}\sum_{x=0}^{n} f_x (x - m)^3 - s^2\left[\frac{2s^2}{m} - 1\right]$$

$$\phantom{T} = \frac{1}{N}\left\{\sum_{x=0}^{n} f_x x^3 - 3m\sum_{x=0}^{n} f_x x^2 + 2m^2\sum_{x=0}^{n} f_x x\right\} - s^2\left[\frac{2s^2}{m} - 1\right], \tag{38}$$

where m and $s^2$ are calculated using equations (20) and (23) respectively. A
significant difference between the observed and expected third moments is
determined by comparing T with its standard error, the square root of its
large sample variance, V(T), where

$$V(T) = \frac{1}{N}\left[2m(k + 1)p^2 q^2 \{2(3 + 5p) + 3kq\}\right], \tag{39}$$

where p = m/k, q = 1 + p, and k is the maximum likelihood estimate if
available (Anscombe (1950) and Bliss (1953)).

Test 3. Compute the observed second moment and subtract from it its
expectation. That is, calculate U, where

$$U = s^2 - \left(m + \frac{m^2}{\hat{k}_2}\right) \tag{40}$$

where $\hat{k}_2$ is calculated using the ratio method (equation (25)). U has the
large sample variance

$$V(U) = 2m[k + 1]pq^2\left[1 - \frac{R^2}{-\ln(1 - R) - R}\right]\Big/ N + p^4 V(\hat{k}_2) \tag{41}$$

where $V(\hat{k}_2)$ is defined by equation (33) but computed with the maximum
likelihood estimate of k if it is available, as are the other terms in
equation (41).

It should be noted that the expected moments in tests 2 and 3 are
computed using other than maximum likelihood estimates. Bliss and Fisher
(1953) state that this is the procedure because V(T) and V(U) are derived
assuming T and U are calculated with a $\hat{k}$ estimated using some method other
than maximum likelihood. Thus V(T) and V(U) are of doubtful applicability
unless the procedure given above is followed.
Transformations

There have been two transformations proposed for stabilizing the
variance of negative binomial distributions. One of these is the
logarithmic transformation given by

$$y = \log\left[x + \frac{k}{2}\right]. \tag{42}$$

Beall (1954) suggests using the more sophisticated transformation given by

$$y = \sqrt{k}\,\sinh^{-1}\sqrt{\frac{x}{k}}, \tag{43}$$

which has been tabled for various values of x and 1/k ranging from 0 to 1.
In both of these transformations, k is actually a common k as defined below
if a common k is available (Northam and Rocha, 1958; Bliss and Owen, 1958).
These transformations, in addition to making the variance independent of the
mean, also tend to make the transformed scores normal in distribution, and
the real effects additive. An analysis of variance can be computed on the
transformed counts and F tests performed since the assumptions underlying
such tests are more nearly met.

Calculation  of  a  Common  k 

Several authors, for example, Anscombe (1949 and 1950), Bliss and
Fisher (1953), Bliss and Owen (1958), and Bliss (1953), have discussed the
calculation of a common k among several negative binomial distributions.
Bliss and Owen (1958) state that since observed counts are often compared
in terms of their means, a stable k would simplify such comparisons
materially. Bliss and Fisher (1953) see an advantage in a common k in that
it increases the [...]. Three methods of calculating a common k follow,
where methods 1, 2, and 3 are based on equations (22), (25),
and the transformation method for estimating k, respectively.

1).   Guess a value of k and calculate T for each set of counts,
i.e., for each negative binomial distribution, where

$$T = \frac{[N - 1]s^2 - \left[N - 1 - \frac{1}{k}\right] m \left[1 + \frac{m}{k}\right]}{[m + k]^2}, \tag{44}$$

where N is the number of observation units counted, and m and $s^2$ are
defined in equations (20) and (23) respectively. The common k is that
value of k for which the sum of the expressions T, over all sets of counts,
is zero. N should be at least 10.

2).   Guess a value of k and calculate U for each set of counts, where
U is given by equation (45) [the equation is illegible in the source; its
surviving fragments show a term log[1 + m/k] and a divisor 2[m + k]]
and where $n_0$ is the number of observation units that are empty. As in the first
method, the object is to use a value of k such that the sum of all U over
all series of counts is zero. Again, N is assumed to be at least 10. For
details on the derivation of equations (44) and (45) see Anscombe (1950).

3).   Calculate the variance of the transformed variate y (where y
is defined by equation (28) or (29)) for each set of counts, pool the answers
and equate to the theoretical variance. N may be as small as two in this
case. The restrictions on m and k for an appropriate transformation to
exist must be observed as mentioned following equations (28) and (29).
Bliss and Fisher (1953), by an expansion of the maximum likelihood
method of estimating k, present a method of calculating a common k (which
they denote by $k_c$) from several series of distributions. Equation (27) is
used to compute the score Z for each component distribution with the same
trial values of k'. These values of Z are added over all component
distributions for each k' trial value resulting in sums S(Z). Different trial
values of k' are used until two of the sums are obtained which bracket zero
very closely. By interpolation between these two sums, the required
$k_c$ is that value of k' for which S(Z) equals zero. Denoting these two
sums by $S(Z_3)$ and $S(Z_4)$ from corresponding values of $k'_3$ and $k'_4$, the error
variance of $k_c$ can be calculated using equation (34) if $Z_3$ and $Z_4$ of that
equation are replaced by $S(Z_3)$ and $S(Z_4)$.

Bliss and Fisher (1953) also show how to test for homogeneity of k
values over the component distributions by use of χ². That is, do the
values of k from the component distributions differ significantly? If
they do not, then a common k can be derived. For each negative binomial
distribution $Z_3$ and $Z_4$ are calculated using $k'_3$ and $k'_4$. Then for each
distribution, the ratio given by equation (46) [the equation is illegible in
the source; its denominator involves $Z_3$ and $Z_4$] is computed. From the sum
of all of these ratios over all distributions for
which the ratio was computed is subtracted the ratio (equation (46))
calculated using the sums $S(Z_3)$ and $S(Z_4)$ in place of $Z_3$ and $Z_4$ for $k'_3$ and
$k'_4$ respectively. The difference is distributed as a chi-square with (g - 1)
degrees of freedom, where g is the number of distributions being
considered. If the difference between the k's is not significant, then a
common k is appropriate [the remainder of this sentence is illegible in the
source].

Bliss and Owen (1958) developed a method of calculating a common k
which is an extension of Anscombe's estimate in terms
of regression and small series. For each negative binomial distribution,
the quantities x' and y' are computed, where

$$x' = \bar{x}^2 - \frac{s^2}{N} \tag{47}$$

and

$$y' = s^2 - \bar{x}. \tag{48}$$

If y' is plotted against x', where x' is on the abscissa, the line fitting
these points and passing through the origin has the slope $b = 1/k_c$. Since
the increase in the variability of y' is often roughly proportional to the
increase in x', a first estimate of $1/k_c$ is

$$\frac{1}{k_c} = \frac{\sum y'}{\sum x'}, \tag{49}$$

where the summation extends over all component distributions.
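
The first estimate of a common k can be sketched as follows. The statistics are assumed to be x' = x̄² - s²/N and y' = s² - x̄ for each series, and the estimate is taken as the ratio of their sums; the series used in the usage note are invented, and the function names are hypothetical.

```python
def series_stats(counts):
    """Return (mean, variance, x', y') for one series of counts."""
    n = len(counts)
    mean = sum(counts) / n
    s2 = sum((c - mean) ** 2 for c in counts) / (n - 1)
    x_prime = mean ** 2 - s2 / n        # assumed form of equation (47)
    y_prime = s2 - mean                 # assumed form of equation (48)
    return mean, s2, x_prime, y_prime

def first_common_k(series):
    """Equation (49): 1/k_c = sum(y') / sum(x'), returned as k_c."""
    stats = [series_stats(c) for c in series]
    sum_x = sum(s[2] for s in stats)
    sum_y = sum(s[3] for s in stats)
    return sum_x / sum_y
```

For the two invented series [0, 1, 2, 3, 10] and [2, 2, 8, 0, 3] the sums are Σx' = 14.3 and Σy' = 18.5, giving a first estimate of about 0.77.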

It is possible to determine whether a common k holds over all
distributions by calculating

$$\frac{1}{k} = \frac{y'}{x'} \tag{50}$$

for each distribution and plotting it against its mean, $\bar{x}$. If 1/k does
not consistently increase or decrease with $\bar{x}$, and if there is no distinct
clustering of points, a common k may be fitted.
A more accurate method for fitting k has been developed (Bliss and
Owen, 1958). For each negative binomial distribution in the set of
distributions for which a common k is to be fitted, calculate the following
quantities: $\bar{x}$, $s^2$, x', y', y'/x', $(m + k')^2$, wx', and wx'y', where

$\bar{x}$ = observed mean given by equation (20),

$s^2$ = observed variance given by equation (23),

x' = a statistic given by equation (47),

y' = a statistic given by equation (48),

$$k' = \frac{\sum x'}{\sum y'}, \tag{51}$$

where the summation extends over all distributions,

and

$$w = \frac{0.5[N - 1]k'^4}{\left\{k'[k' + 1] - [2k' - 1]/N - 3/N^2\right\} x'[m + k']^2} = \frac{A}{x'[m + k']^2} \tag{52}$$

where A is a constant for each distribution if N is constant from
distribution to distribution. The quantity $k_c$ is then given by

$$k_c = \frac{\sum[wx'^2]}{\sum[wx'y']} \tag{53}$$

where the summation is over all distributions. If $k_c$ differs appreciably
from k', the values of $(m + k')^2$ and wx' are recalculated using $k_c$ instead
of k'. Equation (51) is a good first estimate of $k_c$ if the quantities wx'
are relatively stable over all distributions.

Instead of calculating w for each distribution, the product wx' may
be obtained by using the equation

$$wx' = \frac{A}{[m + k']^2}. \tag{54}$$

The weight is based on the variance of the difference y' - bx' [this
passage is partly illegible in the source], where b = 1/k and $E(y') = m^2/k$.
The reciprocal of the variance of this difference is the weight w (Bliss and
Owen, 1958).

A χ² test may also be used here to test whether a series of negative binomial distributions has a common k. The procedure is to calculate the quantity wy' for each distribution. This is most easily done by first computing y'/x' and then computing

$$(wx')(y'/x') = wy'. \tag{55}$$

The products (y')(wy') are summed over all distributions to obtain Σ[wy'²]. The quantity

$$\sum [w y'^{2}] - \frac{\left[\sum (w x' y')\right]^{2}}{\sum (w x'^{2})} \tag{56}$$

is distributed approximately as a χ² with (g - 2) degrees of freedom, where g is the number of distributions. An additional degree of freedom is lost because the second term of expression (56) involves the slope of the line with a zero intercept. It is important that k and k' agree closely before column wy' is calculated because this chi-square test is sensitive to any discrepancy between the two.
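The statistic can be sketched as a short Python function. It assumes expression (56) has the form Σ(wy'²) − [Σ(wx'y')]²/Σ(wx'²), and it takes x', y', and w as already computed; the function name and the example values are hypothetical.

```python
def common_k_chi_square(x_prime, y_prime, weights):
    """Return (statistic, degrees of freedom) for the common-k test.

    The statistic is the weighted residual sum of squares of y' about a
    line through the origin, which is the assumed form of expression (56).
    """
    s_wyy = sum(w * y * y for w, y in zip(weights, y_prime))
    s_wxy = sum(w * x * y for w, x, y in zip(weights, x_prime, y_prime))
    s_wxx = sum(w * x * x for w, x in zip(weights, x_prime))
    statistic = s_wyy - s_wxy ** 2 / s_wxx
    degrees_of_freedom = len(x_prime) - 2
    return statistic, degrees_of_freedom


# Hypothetical example: points lying exactly on a line through the origin
# give a statistic of zero (up to rounding), i.e. no evidence against a common k.
stat_exact, df_exact = common_k_chi_square(
    [4.0, 9.0, 16.0], [2.0, 4.5, 8.0], [1.0, 1.0, 1.0]
)
# Moving one point off the line makes the statistic positive.
stat_off, df_off = common_k_chi_square(
    [4.0, 9.0, 16.0], [2.0, 6.0, 8.0], [1.0, 1.0, 1.0]
)
```

The statistic would then be compared with a tabulated χ² value on the returned degrees of freedom.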

Bliss and Owen (1958) also describe a method of calculating a common k that is useful when k is needed to transform negative binomial counts so that an analysis of variance may be computed.

Egg Count Studies

There have not been many papers written describing egg count studies where the counts follow the negative binomial distribution. Most egg count studies are concerned with replicate counts from the same animal, and these counts are usually Poisson distributed. Two papers that involve the negative binomial distribution will be discussed here.

Hunter and Quenouille (1952) took egg counts on two different breeds of sheep from three counties during the months of June, January, and July in Scotland, and found that some sheep tend to give much higher counts than would arise in a random distribution, possibly due to different levels of resistance to the parasites. They believed that the negative binomial might be appropriate because the following two points were reasonable assumptions to make in light of the egg counts they observed:

1). Counts from the same sheep follow a Poisson distribution.

2). The quantity m in different animals follows the logarithmic distribution.

Hunter and Quenouille (1952) fitted 13 series of observations to the negative binomial by computing the expected number of counts for x = 0,1,2,... and comparing these with the observed frequencies. The number of sheep in these series ranged from 49 to 90. A χ² test was computed for each series, yielding a total χ² value over all 13 series of 38.35 with 34 degrees of freedom, resulting in P > 0.25. They conclude therefore that the negative binomial is applicable to these counts.
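The fitting step can be illustrated with a minimal Python sketch. It assumes the parameterization used in this report (mean kp, variance kp(1 + p)), for which P(0) = (1 + p)^(-k) and P(x + 1) = P(x)·[(k + x)/(x + 1)]·[p/(1 + p)]. The function names and the observed frequencies are hypothetical and are not Hunter and Quenouille's data.

```python
def negative_binomial_probs(k, p, max_count):
    """P(X = x) for x = 0..max_count, with mean k*p and variance k*p*(1 + p)."""
    probs = [(1.0 + p) ** (-k)]
    ratio = p / (1.0 + p)
    for x in range(max_count):
        probs.append(probs[-1] * (k + x) / (x + 1) * ratio)
    return probs


def chi_square_fit(observed, k, p):
    """Chi-square statistic comparing observed frequencies with NB expectations.

    observed[x] is the number of animals with count x; the last class is
    treated as "x or more" so that the expected frequencies sum to the total.
    """
    total = sum(observed)
    probs = negative_binomial_probs(k, p, len(observed) - 1)
    probs[-1] = 1.0 - sum(probs[:-1])  # pool the upper tail into the last class
    expected = [total * q for q in probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed frequencies for counts 0, 1, 2, 3, and 4 or more.
observed = [12, 15, 14, 11, 38]
statistic = chi_square_fit(observed, k=2.0, p=2.0)
```

Statistics computed this way for separate series can be added, along with their degrees of freedom, to give a pooled test of the kind reported above.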

The authors point out that a knowledge of p and k provides a convenient summary of any series of counts. They state that p usually is more [...] and found that [...] 18.3 ± 4.19 to [...], and k remained fairly uniform, varying from [...] ± 0.61 to [...].2 ± 0.12. The authors mention that the use of k allows comparison of distributions of egg counts under differing conditions and at different times to determine how uniformly the worms are distributed under these varying conditions.

The relative efficiency of different size samples in terms of the variance is also investigated. The equation

$$\frac{kp(1+p)}{kp(1/r+p)} \times 100\% = \frac{1+p}{1/r+p} \times 100\% \tag{57}$$

is derived for this purpose, where kp(1 + p) is the variance of a negative binomial distribution for a particular sample size, and kp(1/r + p) is the variance per unit sample of a sample r times as large. Thus, once p is known for a particular sample size, the relative efficiency of various multiples of this sample size may be computed.

Hunter and Quenouille (1952) also use the expression

$$\frac{100\%}{1 + \dfrac{k}{m}} \tag{58}$$

to calculate the relative efficiency in terms of the possible efficiency obtainable with a sample of unlimited size. By using various values of m and k in equation (58) they conclude that for large k (over 0.6) it is advisable to take a sample large enough to make the mean number of slides counted per sheep about four. For values of k between 0.4 and 0.7 a mean of 2 slides is sufficient.
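Both efficiency expressions are simple enough to sketch in Python. The sketch assumes equation (57) reads (1 + p)/(1/r + p) × 100% and equation (58) reads 100%/(1 + k/m), with m = kp; the function names and the example values are hypothetical.

```python
def relative_efficiency(p, r):
    """Assumed form of equation (57): efficiency (%) of a sample r times as large."""
    return 100.0 * (1.0 + p) / (1.0 / r + p)


def efficiency_vs_unlimited(m, k):
    """Assumed form of equation (58): efficiency (%) relative to an unlimited sample."""
    return 100.0 / (1.0 + k / m)


# Hypothetical example: k = 0.5 and a mean count of m = 4, so p = m / k = 8.
k, m = 0.5, 4.0
p = m / k
doubling = relative_efficiency(p, 2)          # gain from doubling the sample
versus_limit = efficiency_vs_unlimited(m, k)  # current sample vs. unlimited sample
```

The two readings are consistent with each other: as r grows without bound, (1 + p)/(1/r + p) tends to 1 + 1/p = 1 + k/m, whose reciprocal is the ratio given by equation (58).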

The problem of how much time should elapse between repetitions of the whole sampling procedure is also discussed. The percentage gain in information from taking a second set of samples n days after the first is

$$100 \times \frac{1 + \dfrac{k}{m} - \rho_n}{1 + \dfrac{k}{m} + \rho_n}, \tag{59}$$

where ρ_n is the correlation between the infestations n days apart. After examination of this expression for various values of k/m and ρ_n, Hunter and Quenouille conclude that weekly sampling is adequate for most practical purposes.
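The behaviour of this expression can be explored with a short Python sketch. It assumes equation (59) has the form 100 × (1 + k/m − ρ_n)/(1 + k/m + ρ_n), which is a reading of a damaged equation; the function name and the example values are hypothetical.

```python
def percentage_gain(k_over_m, rho_n):
    """Assumed form of equation (59): % gain in information from a second sample.

    k_over_m is the ratio k/m; rho_n is the correlation between the
    infestations n days apart.
    """
    return 100.0 * (1.0 + k_over_m - rho_n) / (1.0 + k_over_m + rho_n)


# Hypothetical values: the gain falls as the correlation between the
# two sampling occasions rises.
gains = [percentage_gain(0.25, rho) for rho in (0.0, 0.3, 0.6, 0.9)]
```

Under this reading, uncorrelated infestations (ρ_n = 0) give a gain of 100%, that is, the second sample is worth as much as the first.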

These two authors conclude by pointing out that a comparison of egg counts between two flocks of sheep should take into account the parameter k, since the dispersion of eggs throughout the population is just as important as the number of parasites. Whether a high or a low k is more beneficial to the parasite is unknown. In more practical terms, as Hunter and Quenouille put it, it is unknown "whether eggs dropped on pasture would survive better if isolated in a few large groups (low k) than in more evenly dispersed smaller groups (high k)".

Northam and Rocha (1958) did a statistical study on the worm burden of chickens and showed that the distribution of the worm counts followed the negative binomial law. They present 4 large frequency distributions of counts of Ascaridia galli in experimentally infected chickens. They assumed that

1). the frequency distribution of the number of worms per bird follows the Poisson series, provided the chickens have the same genetic resistance and that external factors are controlled, and


2). [...]

Northam and Rocha (1958) [...] by citing three series of worm counts [...] subdivided into 3 parts: male, female, and combined. [...] over all 3 series were taken from the same inbred line in an attempt to control genetic resistance variability. Each distribution, for male, female, and combined, was shown to follow the Poisson law. They stress the importance of controlling external factors in arriving at a Poisson distribution by citing another experiment similar to the one above, except that several illnesses occurred, the result being over-dispersion and non-Poisson distributions of worm counts. The authors do not test assumption two directly.

For each of the 4 distributions, they calculate the mean, standard deviation, standard error of the mean, and 3 estimates of k: by the method of maximum likelihood, by the method of moments, and from the proportion of birds with no worms to the total number of birds investigated. They note that the 4 distributions do not have a common value of k. A χ² test was performed on each distribution, with resulting values of P ranging from 0.10 to 0.25, thus indicating that the negative binomial distribution adequately described the distributions.
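Two of the three estimates are easy to sketch in Python. The sketch assumes the usual forms of these estimators for a negative binomial with mean m and variance m + m²/k: the moment estimate k = m²/(s² − m), and the zero-class estimate obtained by solving k·ln(1 + m/k) = ln(N/n₀), where n₀ of the N birds have no worms. The function names and the example values are hypothetical and are not Northam and Rocha's data; the maximum likelihood estimate is omitted for brevity.

```python
import math


def k_by_moments(mean, variance):
    """Moment estimate k = m^2 / (s^2 - m); requires s^2 > m."""
    if variance <= mean:
        raise ValueError("no over-dispersion: variance must exceed the mean")
    return mean ** 2 / (variance - mean)


def k_by_zero_proportion(mean, n_total, n_zero, tol=1e-10):
    """Solve k * ln(1 + m/k) = ln(N / n0) for k by bisection."""
    target = math.log(n_total / n_zero)
    if not 0.0 < target < mean:
        raise ValueError("no finite positive solution for k")

    def f(k):
        # Increasing in k, from 0 (k -> 0) towards mean - target (k -> infinity).
        return k * math.log(1.0 + mean / k) - target

    lo, hi = 1e-9, 1.0
    while f(hi) < 0.0:  # expand the upper bound until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example constructed so that both methods recover k = 2:
# m = 4, s^2 = m + m^2/k = 12, and P(0) = (1 + m/k)^(-k) = 1/9.
k_mom = k_by_moments(4.0, 12.0)
k_zero = k_by_zero_proportion(4.0, n_total=900, n_zero=100)
```

On real counts the two estimates generally differ, which is why their efficiency relative to maximum likelihood matters.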

The fact is noted that, with the values of k and m they calculated, the relative efficiency of the method of moments as compared with the maximum likelihood approach is around 50%, while the ratio method is somewhat better, giving a relative efficiency between 75% and 90%. Northam and Rocha (1958) go on to apply these statistical results to an experiment by Rocha in 1955 using the drug phenothiazine.

CONCLUSIONS

The preceding discussion has centered around the Poisson and the negative binomial distributions, with a consideration of many of their properties such as derivations, parameter estimation procedures, transformations, and applicability to egg and worm count studies.

It was shown that replicate worm and egg counts obtained by the use of a dilution technique (usually the McMaster slide) are distributed according to the Poisson distribution. This distribution is not too difficult to work with because of the equality of the mean and variance and the property of convergence to the normal distribution with large m.

The negative binomial distribution is not as simple to work with as the Poisson. This is due to the fact that two parameters must be specified for each negative binomial distribution, namely m and k, where k is a measure of over-dispersion, i.e., a measure of how much variability is present beyond what we could expect in a Poisson series.

Perhaps two of the more important techniques presented under the negative binomial section were the calculation of a common k and the calculation of test statistics used to determine if the common k calculated is valid over all component negative binomial distributions. In sheep studies, for example, an experimenter may be interested in determining whether two negative binomial distributions fitted to different egg counts are really significantly different from each other. A comparison of means between the two distributions in an attempt to answer this question is much more meaningful if it can be shown that a common k can be calculated.

As a concluding remark, it should be noted that the negative binomial is not the only over-dispersed distribution, although it is the easiest to work with. Other distributions, such as the Poisson Pascal (Katti and Gurland, 1961), the Conditional Poisson (Cohen Jr., 1960), and the truncated negative binomial (Bartko, 1961; Sampford, 1955), are also discussed in the literature, but are of only secondary interest to the topic of this report.

The parasitologist is often interested in the frequency distribution of egg counts of internal parasites found in various animals. If such a distribution is known, the parasitologist can predict more accurately the number of eggs expected in a future count from the same population. Interest in egg counts stems from the fact that an accurate worm count can be obtained only at necropsy.

Two statistical distributions have been shown to be of major importance in egg count studies. The first of these is the Poisson distribution, which is applicable when a dilution technique in counting is used (such as the McMaster slide method) and when the counts are taken from a homogeneous population, such as repeated sampling from the same host.

Various  workers  such  as  Peters,  Leiper,  Emik,  Hunter  and  Quenouille, 
and  Brambell  have  performed  egg  count  studies  on  sheep  and  have  found  the 
Poisson  distribution  to  adequately  fit  the  egg  counts  obtained. 

The mean and variance of a Poisson distribution are equal, resulting in only one parameter having to be estimated, namely the mean. In the event that an analysis of variance is going to be calculated using Poisson distributed counts, a transformation is usually required. The transformation applied to Poisson distributed data is usually the square root, √x or √(x + 1/2), depending on the size of the count.

The chi-square test may be used to test: (1) whether a given series of counts actually follows the Poisson distribution, (2) whether the mean of a Poisson is greater than a particular value, and (3) whether several Poisson distributions have the same mean. For two Poisson distributions with large means, the t statistic may be used to test for significant differences between the means of the distributions.
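One common form of the first test is the index-of-dispersion statistic, Σ(x − x̄)²/x̄, referred to a χ² distribution with n − 1 degrees of freedom. The Python sketch below illustrates that standard statistic; it is not necessarily the exact procedure used in the studies cited, and the function name and counts are hypothetical.

```python
def poisson_dispersion_statistic(counts):
    """Return (statistic, degrees of freedom) for the index-of-dispersion test.

    Under the Poisson hypothesis the variance equals the mean, so the sum of
    squared deviations divided by the mean behaves like a chi-square variate.
    """
    n = len(counts)
    mean = sum(counts) / n
    statistic = sum((x - mean) ** 2 for x in counts) / mean
    return statistic, n - 1


# Hypothetical replicate egg counts from one suspension.
statistic, degrees_of_freedom = poisson_dispersion_statistic([2, 4, 6])
```

A statistic much larger than its degrees of freedom points to over-dispersion, the situation in which the negative binomial becomes relevant.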


[...] this greater variability, the Poisson distribution is not [...] describe such counts. Several distributions have been [...]. The easiest to work with and the most applicable is the negative binomial.

The negative binomial can arise from many different biological circumstances, but egg count distributions seem mainly to arise from two models: one where m is not constant from one Poisson distribution to the next and is, in fact, distributed according to the Pearson type III curve, and another where m follows the logarithmic distribution from one Poisson to another.

The negative binomial distribution is specified by two parameters, the mean, m, and an index of over-dispersion, k. The parameter k has been calculated using at least four different techniques: (1) the method of moments, (2) the ratio of the total number of counts to the number of counts with no eggs present, (3) the method of maximum likelihood, and (4) the transformation method. The maximum likelihood approach is fully efficient while the other methods are not. Their efficiency can usually be calculated, however. The variances of x and k can also be found.

The chi-square statistic may be used to test for goodness of fit of the negative binomial. Tests using the difference between the observed and expected second and third moments are also applicable for this purpose.

Two transformations are commonly used to stabilize the variance of a negative binomial distribution. One is the logarithmic transformation described by Anscombe, and the other was suggested by Beall and involves sinh⁻¹.

A stable k between several negative binomial distributions is often advantageous (especially in comparing means). Accordingly, several methods have been developed for finding a common k. Anscombe discusses three such methods based on methods for calculating a single k. Bliss uses an extension of the maximum likelihood technique, and Bliss and Owen present a method based on a weighted moment estimate in terms of regression. Various χ² tests are available for determining whether a common k exists between several distributions.

Both egg and worm count studies have produced counts distributed according to the negative binomial, as shown by Hunter and Quenouille, Egerton, Egerton and Hansen, and Northam and Rocha.


